OPERATOR GUIDE TO THE PLURIBUS IMP 


<<Version = 1200 Patch 7>> 


This is a general guide to running the IMP for people who are not IMP 


programmers. Familiarity with Pluribus hardware and general systems 


concepts is assumed. 


Some of the commands described in this document require that 


the site have the override capabilities enabled on their Node. The 


site operator must call the NOC to obtain this capability before they 


can use these commands. 


10.1 INTRODUCTION 


Because the new version of software (PSE 1200 Patch 7) does 


not support DDT, without NOC intervention, a diagnostic operating 


system was created to help diagnose IMP problems. This diagnostic 


must be run with the IMP off-line to the network. Before loading 


in the diagnostic operating system, the modem connections to the 


IMP must be uncoupled. The diagnostic software is not compatible


with the operational version of IMPSYS. If the modem connections 


are not uncoupled, and the diagnostic system is loaded, the IMP 


will not come up.


The commands for the diagnostic software are the same as 


those for the operational version of IMPSYS. These can be found 


in the preceding pages of this manual. Some commands need the
OVERRIDE capability set. This is also true of the diagnostic software,
but NOC intervention is not needed to obtain the OVERRIDE capability.
With this diagnostic software, all functions of IMP DDT can be
accomplished without NOC intervention.


10.2 BUILDING VHA BY HAND 


A. Turn OVERRIDE ON <control-O>.


B. VHA Locations 


2,4A20/ SERIAL NUMBER OF VHA 
2,4A22/ LENGTH OF VHA TABLE 
2,4A24/ ALWAYS ZERO (0) 
2,4A26/ VHA 1 ALWAYS ONE (1) 
2,4A28/ VHA 2 

2,4A2A/ VHA 3 

ETC. ETC. 


Starting with location 2,4A28 the VHA would be entered as:

Host Port = 0    IMP = 5
<2,4A28/> 0 <005> would be VHA 2

Host Port = 0    IMP = 21
<2,4A2A/> 0 <015> would be VHA 3

All locations must be entered as hexadecimal numbers.
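
As an illustration only (this is not part of the IMP software), the following Python sketch computes the DDT location of a VHA table entry from the layout listed above. It assumes that entries past VHA 3 continue at two-location intervals, as the "ETC." suggests. The names vha_location and ddt_examine are hypothetical.

```python
VHA_1_LOCATION = 0x4A26   # location of VHA 1, taken from the table above
VHA_MAP = 2               # the "2," prefix used in the examples above


def vha_location(n: int) -> int:
    """Return the location of VHA n (n = 1, 2, 3, ...)."""
    if n < 1:
        raise ValueError("VHA numbers start at 1")
    # Assumption: each entry occupies one word, i.e. two locations.
    return VHA_1_LOCATION + 2 * (n - 1)


def ddt_examine(n: int) -> str:
    """Return the text an operator would type to open VHA n."""
    return f"{VHA_MAP},{vha_location(n):04X}/"
```

For example, ddt_examine(2) gives the string 2,4A28/ used in the example above.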
LOCAL DISPLAYED INFORMATION


1.1 CRT DISPLAY


The top part of the terminal screen displays host and modem
status and traps that have been sent to the NMC.

Format (all in hex):

Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx Bxxx     hosts
Dxxx Dxxx Dxxx Dxxx Dxxx Dxxx Dxxx Dxxx                    modems

TRAP# PROC# COUNT REG1 REG2 REG3 REG4 REG5 REG6 REG7


Each status word for hosts/modems consists of 4 digits of 


coded information as explained below: 


MODEMS STATUS

| Digit        | Description                     | Normal Value   |
| 1 (leftmost) | I/O Bus Currently Being Used    | D (F bus)      |
|              |   C = Modem Interface on E-bus  |                |
|              |   D = Modem Interface on F-bus  |                |
| 2            | Modem interface number          | 0 thru 7       |
| 3            | Interface LOOP Status           | 0 (NOT LOOPED) |
|              |   0 = NOT LOOPED                |                |
|              |   4 = INTERNALLY LOOPED         |                |
|              |   8 = EXTERNALLY LOOPED         |                |
| 4            | Interface (line) Status         | 0 (UP)         |
|              |   0 = UP                        |                |
|              |   1 = DOWN                      |                |
|              |   F = NEVER EXISTED             |                |

HOSTS STATUS

| Digit        | Description                          | Normal Value |
| 1 (leftmost) | I/O Bus Currently Being Used         | B (F bus)    |
|              |   A = Host Interface on E-bus        |              |
|              |   B = Host Interface on F-bus        |              |
| 2            | Host interface number                | 0 thru 12    |
| 3            | ----- NOT USED                       | Always 0     |
| 4            | Interface (host) Status              | 0 (UP)       |
|              |   0 = UP                             |              |
|              |   1 = DOWN                           |              |
|              |   2 = TARDY                          |              |
|              |   3 = NON EXISTENT                   |              |
|              |   4 = IMP SOFTWARE NOT INITIALIZED   |              |
|              |       or HOST RECEIVED QUIT          |              |
|              |   F = NEVER EXISTED                  |              |
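
The coding of the four digits can be sketched in Python as follows. This is an illustrative decoder only, not IMP code. The digit meanings are copied from the MODEMS STATUS and HOSTS STATUS tables; the function name decode_status and the dictionary layout are hypothetical.

```python
# Digit meanings copied from the MODEMS STATUS and HOSTS STATUS tables.
MODEM_BUS = {0xC: "E-bus", 0xD: "F-bus"}
HOST_BUS = {0xA: "E-bus", 0xB: "F-bus"}
LOOP = {0x0: "NOT LOOPED", 0x4: "INTERNALLY LOOPED", 0x8: "EXTERNALLY LOOPED"}
MODEM_STATE = {0x0: "UP", 0x1: "DOWN", 0xF: "NEVER EXISTED"}
HOST_STATE = {
    0x0: "UP",
    0x1: "DOWN",
    0x2: "TARDY",
    0x3: "NON EXISTENT",
    0x4: "IMP SOFTWARE NOT INITIALIZED or HOST RECEIVED QUIT",
    0xF: "NEVER EXISTED",
}


def decode_status(word: int) -> dict:
    """Split a 4-digit hex status word into its coded fields."""
    bus, number, loop, state = ((word >> shift) & 0xF for shift in (12, 8, 4, 0))
    if bus in MODEM_BUS:
        return {"kind": "modem", "bus": MODEM_BUS[bus], "interface": number,
                "loop": LOOP.get(loop, "unknown"),
                "state": MODEM_STATE.get(state, "unknown")}
    if bus in HOST_BUS:
        # Digit 3 is not used for hosts, so it is ignored here.
        return {"kind": "host", "bus": HOST_BUS[bus], "interface": number,
                "state": HOST_STATE.get(state, "unknown")}
    raise ValueError(f"{word:04X}: first digit is not A, B, C or D")
```

A word of D300 would thus read as modem interface 3 on the F-bus, not looped and up.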


1.2 CONTROL PANEL


OPERATOR PANEL: The standard display on the operator panel of a 


running IMP is: 


REGISTER LIGHTS - not relevant 


ADDRESS LIGHTS - a multiplexed display, which usually displays
which processors are running the system by blinking the bit
corresponding to a given processor. The lowest numbered
processor is bit #0, so for a system with processors 12, 13, 32, &
33, the assignment would be bit#0=P12, bit#1=P13, bit#2=P32,
bit#3=P33. Processors which are discovered (bus couplers exist) but
which are not running the system fully are represented by a
non-blinking bit position. If either processor on the bus with the
operator panel is running STAGE, the number of the stage being
run is displayed by the bit number of the left-most bit which is
off. Thus, if a processor is running stage 5, bit 5 and all bits to
the right of it would be off. If one processor is running stage
and the other is running the system, the blinking processor display
would be overlaid on the stage running display. Lines are
displayed in address lights 15-8. Line 1 = bit 15, and a bit off
indicates a line declared up by the IMP. A line displayed as up in
the lights will typically not be declared up by the network for
another minute or so.


DATA LIGHTS - usually display what hosts are up in a running
system. Hosts are displayed by bits 15-0 in the same manner as lines
are. A bit off indicates that a host is declared up by the IMP.
Thus, a display of 3F1F in the data lights indicates that hosts 0, 1,
8, 9 and 10 are up. When the reloader/dumper is running, the
current address is displayed in the data lights. This will count if
a reload or dump is in progress. The stage running will be
displayed in the address lights.
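
The reading of the data lights can be checked with a short Python sketch. It is an illustration only and assumes, from the 3F1F example, that host 0 is shown in bit 15, host 1 in bit 14, and so on down to host 15 in bit 0. The name hosts_up is hypothetical.

```python
def hosts_up(data_lights: int) -> list[int]:
    """Return the host numbers whose bit is OFF (declared up by the IMP)."""
    up = []
    for host in range(16):
        bit = 1 << (15 - host)      # assumption: host 0 = bit 15
        if not data_lights & bit:   # a bit off means the host is up
            up.append(host)
    return up
```

With this reading, hosts_up(0x3F1F) gives hosts 0, 1, 8, 9 and 10, matching the example above.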


1.3 RLD CARD LIGHTS


The RLD card has three lights arranged vertically on the front edge 
of the card. The top light flashes each time the special reload 
header is detected coming into any of the modems connected to the RLD. 
The middle light indicates that at least one such header has been
detected since the last time the bus was preset. The bottom light 
flashes each time a reload packet with correct checksum is 
detected and used to cause an appropriate bus transaction (store 
to memory, write a register, etc). When a reload using the RLD card 
is in progress (and being successful), the top and bottom lights 
will flash together, and the middle light will stay on. If only the 
top light flashes, it indicates that the checksums are bad on the 


received reload packets (due possibly to a flakey line or interface). 


1.4 PID CARD LIGHTS 


The lights on the edge of the PID card indicate the highest pid
value which has been written to the card and not read yet. Their main
value in a running system is to provide an indication of I/O bus
activity. Thus, if the processors connected to the console are both
halted for some reason (and therefore not driving the control
panel), some indication of life in the system can be gotten by
looking at the PID cards.


2 RUNNING IMP DDT 


2.1 IMP PAGE TYPES 


The IMP thinks of parts of itself as having type numbers. These
parts are called pages (logical pages, as opposed to the physical page
numbers for chunks of core). These are assigned as follows:

Type (Hex):    Page Name

0              Reliability Code
2              DDT Code
4              Warm Code
6              Fake Host Code
10             Spare Reliability Code
12             Spare DDT Code
14             Spare Warm Code
16             Spare Fake Code
20             IMP Variables
22             IMP 2nd Variables
30             IMP Buffers page 1
32             IMP Buffers page 2


2.2 DDT COMMANDS 
Numbers 


The new DDT works solely in hexadecimal. The radix commands 
<esc> 0, H, OD and the radix specifiers "’" , "}" , "." have all 
been removed or applied to other purposes. Another character ($) is 
now used to specify decimal input rather than hexadecimal and is the 


only exception to the hexadecimal rule. 


Commands 

x,y/    This is the basic examine command of DDT to return the
        contents of a memory location. There are two cases of this
        command depending on the value of Y. If Y is a local address
        (i.e. Y is less than 4000) then X is the mask of processors
        whose memory is to be examined (this means that the answer
        returned will be from one of the processors specified by the
        mask X). A special case is a negative mask value, which sets
        the processor mask to be all those that are known to exist
        to stage BD. In the other case, when Y is greater than 4000
        and therefore a reference to common memory, X then specifies
        the map setting to use in the reference. In this case X can be
        either a logical page (x < 200) or a physical page (x is odd
        or x > 200).

x/      This is a simplified case of the above and does an examine
        of the address X using the last processor mask specified if
        X is a local address or the last map specified if X is a
        common memory address.


***(The following command requires the OVERRIDE CAPABILITY) *** 


x,y<cr> Carriage return is used to insert new values into memory and
        close the location currently examined. X and Y will be inserted
        into the current location and the next location respectively
        if the current location is still open and then the location is
        closed. A location is open when it has been examined but not
        closed with a carriage return or linefeed.


***(The following command requires the OVERRIDE CAPABILITY )*** 


x<cr>   This is just the one argument form of the above and stores just
        one number into memory.

<cr>    This is the no argument form of the above and stores nothing
        but closes the location.

<lf>    Linefeed closes the current location and examines the next
        location.

<sp>    Space adds two arguments together and the result becomes one
        argument, e.g. "x y/" will open the location at x+y.

<del>   Delete (=rubout) will zero all current input and will restore DDT
        to the state it was in at the last typeout.

.       ... of typing the current address. It can no longer be used to
        specify a decimal number.

$       Dollar sign is now used to say that the number just typed in is
        a decimal number.


Typeout 


All typeout from DDT is four-digit hexadecimal.


Examine Formats 


a,b/x  y

This is the usual format of an examine. If a,b is a
local memory reference, then only x will be printed. Note that
the processor that did the reference is no longer specified.
If a,b is a common memory address then x and y are the
contents of the main and spare pages if they differ. If some
kind of error occurs in the reference then the x and y are
replaced by the appropriate error messages for the corresponding
pages.
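
The two cases of the x,y/ examine command described under Commands can be sketched in Python as follows. This is an illustration of the rule as stated in the text, not DDT's own logic. Two points are assumptions: Y equal to 4000 is treated as common memory, and an even X of exactly 200 is rejected because the text does not cover it. The name classify_examine is hypothetical.

```python
COMMON_START = 0x4000   # addresses below this are local memory


def classify_examine(x: int, y: int) -> tuple[str, str]:
    """Describe how DDT would interpret the arguments of x,y/."""
    if y < COMMON_START:
        # Local address: x is a mask of processors.
        if x < 0:
            return ("local", "all processors known to stage BD")
        return ("local", f"processor mask {x:04X}")
    # Common memory address: x is the map setting.
    if x % 2 == 1 or x > 0x200:
        return ("common", f"physical page {x:X}")
    if x < 0x200:
        return ("common", f"logical page {x:X}")
    raise ValueError("x = 200 is not covered by the rule in the text")
```

For example, the VHA reference 2,4A28/ is a common memory reference through logical page 2.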


Error Messages 


There are two formats of error messages for the two situations 
where an error can occur. A system wide error can occur causing 
Stage to put the system into the stand-alone DDT mode (if enabled 
by DEBUGM). In this situation the bit of the processor reporting 
the error is printed first followed by the error followed by the 
location of the error. The second type of error is a store or 
read error as a result of a DDT reference. In these cases the 
error is typed first followed by a number specifying the mask 


of processors that failed the reference. 


Errors

QUIT    The location referenced by DDT resulted in a QUIT or an unexpected
        QUIT occurred in the running of the system. In the system QUIT
        the address returned by DDT is not the address of the instruction
        producing the QUIT but the address of the 001 Trap specifying
        the quit. The location of the QUIT can be found in the snapshot area
        in the Stage variables area.

NX      BLT returned a non-existent memory code as a result of some DDT
        reference.

FRMT    BLT returned a format error in response to some DDT reference.
        This usually means that DDT had set up parameters improperly for BLT
        or that a reference to a nonexistent processor was made.

TO      Timeout - BLT took too long to complete a reference and aborted.

TL      An attempt was made to execute a non-instruction.

FADE    The halt all processors trap was encountered in the running of the
        system.


SPECIAL FEATURES:

FC,X ^C   Crosspatch to Node X's tty.

^D        "CNTL-D" - COMPLEMENT VALUE OF THE ON/OFF SWITCH
          TO EITHER ALLOW OR DISALLOW TRAPS ON THE TERMINAL.

^L        "CNTL-L" - ECHOES A ^L (FORM FEED) TO THE TTY.

"CNTL SHIFT P" UNDOES CROSSPATCHES AND RESUMES TALKING TO DDT.
          (This feature will be useful when users are using the
          crosspatch to talk to another Node in the Network)


*********************************************************************

The site user must notify the NOC before they can use the
following command. This capability is required to perform
some of the privileged commands in this document.

*********************************************************************

^O        "CNTL-O" - COMPLEMENT VALUE OF THE ON/OFF SWITCH
          TO EITHER ENABLE OR DISABLE THE OVERRIDE CAPABILITY.


3. THE OPHELP COMMANDS

#NH     nice stop and halt forever

#NS     nice stop and restart

#NR     nice stop and reload

        The "nice stops" cause the IMP to observe all network
        protocols before going down (send IMP going down
        message, turn off hosts before modems, etc.). Nice
        stops should be used whenever possible if the IMP is on
        an operational net to avoid perturbing the rest of the
        net. After a 'halt forever' stop, DDT may be started
        by pressing RESET and ATTN, or by 308 (R0=308, R8=FC00,
        RUN, from the console), provided the system is in DEBUG
        mode.

#PH     panic stop and halt forever

#PS     panic stop and restart

#PR     panic stop and reload

        Panic stops do exactly that, and are a good way to
        quickly cause the machine to halt or reconfigure (or
        reload) either when there is no network to worry about,
        or when things have to happen in a hurry.

<ESC> C clear illopr tables: zeroes all trap information saved
        in common memory


*** (The following commands in this section require OVERRIDE CAPABILITY)*** 


#HU     unloop host #

#HL     loop host # at interface

        Host looping and unlooping changes will be visible
        immediately in the console lights (in contrast to
        modems), unless the host is broken.

#MU     unloop modem #

#ML     loop modem # at interface

        Looping or unlooping a modem will result in some
        combination of 5C1 or 5C4 traps being generated.
        Looped state will not be visible in the lights until
        the line is declared up by the IMP (1-2 minutes).
        Modems are numbered from 0 to 7 and correspond to
        switch settings from 15 to 8 on the interface.

#HB     return address for param block of host # (0,1,...)

#MB     return address for param block of modem # (0,1,...)


The address is typed out and DDT is left in the same 
condition it would be in if you had typed the address 
in. Therefore, to open loc +20 in the parameter block 
for host 2, type "2,HB" and then " 20/". (see 


description of parameter blocks. ) 


KI      return address of iokill

        useio and iokill are tables which describe what I/O
        the system has discovered and is using (useio) and what I/O
        to ignore and not ever try to discover under any
        circumstances (iokill).

        Both tables are structured in the same way:
        Each table consists of 4 words, each of which controls
        16 possible device addresses (one per bit) starting
        from a corresponding base address in a table called
        iobase.

        iobase  E100
                E200
                F100
                F200

        The least significant bit (=bit 0 or the "1 bit")
        selects the base address; since each device occupies
        10H locations, the "2 bit" (or bit 1)
        selects the device at <base+10H>.

        Therefore, to iokill a device at E220, turn on the "4
        bit" (bit 2) in the second word in iokill (base = E200)
        by loading an 0004 into it. To iokill F1C0, turn on
        the 1000 bit of word 3 in iokill (by entering 1000).
        To select F100, F120, F1C0, F1E0 and F1F0, add all the bits
        together (0001+0004+1000+4000+8000) and load a D005 in
        word 3.
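
The bit arithmetic above can be checked with a short Python sketch. It is an illustration only, built from the iobase table and the 10H device spacing given in the text. The function names are hypothetical, and word numbers are counted from 1 as in the text.

```python
IOBASE = (0xE100, 0xE200, 0xF100, 0xF200)   # base address for words 1..4
DEVICE_SPACING = 0x10                       # each device occupies 10H locations


def iokill_bit(device: int) -> tuple[int, int]:
    """Return (word number 1..4, bit mask) for a device address."""
    for word, base in enumerate(IOBASE, start=1):
        offset = device - base
        if 0 <= offset < 16 * DEVICE_SPACING and offset % DEVICE_SPACING == 0:
            return word, 1 << (offset // DEVICE_SPACING)
    raise ValueError(f"{device:04X} is not a device address covered by iobase")


def iokill_word(devices: list[int], word: int) -> int:
    """OR together the masks of all listed devices that fall in one word."""
    value = 0
    for device in devices:
        w, mask = iokill_bit(device)
        if w == word:
            value |= mask
    return value
```

For example, iokill_bit(0xE220) gives word 2 with mask 0004, and the five bits 0001+0004+1000+4000+8000 (devices F100, F120, F1C0, F1E0 and F1F0) combine to D005 in word 3.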


Use of the features: useio is useful both as
information about what the system has discovered, and
as a means of forcing the system to switch to the other
interface of a doubled pair. This is accomplished by
removing the bit for the member of the pair currently
being used. The removed interface will be rediscovered
by the system and reentered in the table in about 1
minute. Adding bits to useio may cause the system to
go through stage (since if the device doesn't exist we
will get quits) and is NOT recommended in general, but
can speed discovery of devices that do exist. Removing
F devices in doubled M/I machines may cause the system
to stage because of the asymmetry of the m/i-m/i path.
iokill provides the mechanism for permanently removing
devices from the system. Setting a bit in iokill kills
the device and causes the corresponding bit to be taken
out of useio. Turning off the bit causes the system to
be able to discover the device again.


RELOADING 


4.1 RELOADING USING CASSETTE 


All cassette tapes are set up to load and start the Imps without 


operators’ intervention. To load from cassette, place the cassette 


tape in the reader and depress the RESET and LOAD buttons on the 


operators’ panel. The Imp will automatically startup after the 


cassette tape is read in and rewind to load point. 


4.2 RELOADING USING THE RELOAD CARD 


Since this operation is only initiated by the NMC using a NMC NU 


system, there is nothing to do at the site. The RLD card was 


meant to be used when it is not possible to get help in reloading 


a dead IMP. It is generally more reliable and faster to use 


IMPSYS cassette tapes. 


MANUAL RESOURCE SWITCHING 


5.1 PROCESSOR CONTROL


*** This section requires the OVERRIDE capability. ***


Turning processors on and off 


There are several ways to change a processor’s interactions with 
the system. All involve setting the processor mask bit (the bit 
which the processor would blink on the control panel) in one of 
four words in common memory. These words and their effects are 


described below. 


prokil  0,40C6  This causes the specified processor not to be
                restarted if he stops. Furthermore, the system
                will not try to discover whether his control
                register (R15) is there. This allows machines to
                run in split mode without interference (reading
                R15 can halt a running processor).

prohng  0,40C8  This will hang the specified processor at a late
                stage. He will participate in checksumming,
                etc., but not in running the IMP (or blinking his
                bit).

ampman  0,40CA  Ampman is a block of three words allowing various
                kinds of manual amputation. Prohit and proamp
                are the second and third words in the block. The
                first word (ampman) is copied by each processor
                into its buskil word (see below). This is the
                preferred method of removing common buses.

prohit  0,40CC  Setting a processor's bit in prohit also causes
                it not to be started. Also, if it is already
                running, it will halt itself cleanly (may take
                1-2 minutes).

proamp  0,40CE  Proamp causes the whole processor bus to be
                amputated by disabling forward transfers in its
                bus couplers. This is the positive way to turn
                off a processor who may be causing damage to the
                system. (But both processors on the bus get
                turned off when you do this.)
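
The processor mask bit used by these words follows the assignment given for the address lights (lowest numbered processor is bit 0). A minimal Python sketch of that assignment, for illustration only; the name processor_mask_bit is hypothetical and the processor numbers are simply compared as integers.

```python
def processor_mask_bit(processor: int, processors: list[int]) -> int:
    """Return the mask bit for one processor of a given configuration."""
    ordered = sorted(set(processors))       # lowest numbered processor first
    return 1 << ordered.index(processor)    # position in the list = bit number
```

For the configuration with processors 12, 13, 32 and 33, processor 13 gets the 2 bit (bit 1) and processor 33 gets the 8 bit (bit 3).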


5.2 MEMORY CONTROL

There are two ways to turn off memories in the IMP. The first is
by setting bits in a block called memkil to turn off individual
4K pages, and the second is by amputation of a whole memory bus
using a word called buskil. [[MEMORY KILLING USING buskil NOT
IMPLEMENTED YET]]


memkil  032A    (4 word block) Setting a bit causes the
                corresponding 4K memory segment to not be used.
                Correspondence is as follows:

                memkil   - 0 (bit 0) to 1E00 (bit 8000)
                memkil+2 - 2000 - 3E00
                memkil+4 - 4000 - 5E00
                memkil+6 - 6000 - 7E00

(so to turn off the 4200 and 4400 pages, set memkil+4 to 0006)
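
The correspondence can be worked through with a short Python sketch. It is an illustration only and assumes that the pages within each word advance in steps of 200 (hex), which is what the 4200/4400 example implies. The function names are hypothetical.

```python
PAGE_STEP = 0x200        # assumption: one bit per 200 (hex) of page number
PAGES_PER_WORD = 16
LAST_PAGE = 0x7E00


def memkil_bit(page: int) -> tuple[int, int]:
    """Return (offset from memkil, bit mask) for one page."""
    if page % PAGE_STEP or not 0 <= page <= LAST_PAGE:
        raise ValueError(f"{page:04X} is not a page covered by memkil")
    index = page // PAGE_STEP
    offset = 2 * (index // PAGES_PER_WORD)   # memkil, memkil+2, +4, +6
    return offset, 1 << (index % PAGES_PER_WORD)


def memkil_words(pages: list[int]) -> dict[int, int]:
    """Combine several pages into the values to store, keyed by offset."""
    words: dict[int, int] = {}
    for page in pages:
        offset, mask = memkil_bit(page)
        words[offset] = words.get(offset, 0) | mask
    return words
```

Turning off pages 4200 and 4400 gives the single entry memkil+4 = 0006, as in the example above.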


buskil  0328    word for removing busses (type -: to DDT first to
                change all locals)

5.3 BUS CONTROL

I/O busses (NOT memory busses yet) may also be turned off by setting the
appropriate bits in a word in local memory called buskil. The procedure
is described below.

buskil  0328    word for removing busses. (Type -: to DDT first.)
                Since ampman (see above) is copied into buskil, set the
                bits into ampman first (see above).

                The bit assignments for buskil are:

                0 - 1bit - E000 bus
                1 - 2bit - F000 bus
                2 - 4bit - 0 memory bus
                3 - 8bit - 4000 memory bus

5.4 DEVICE CONTROL

useio and       Turning devices on or off is usually done by
iokill          setting or clearing bits in the useio or iokill
                blocks. The KI ophelp command returns
                the starting addresses of the useio and iokill
                blocks.


6. 


HOST AND MODEM CONTROL 


6.1 LOOPING AND UNLOOPING 


*** This section requires the OVERRIDE capability. *** 


Changing the looped state of modems or hosts is done by use of 


the appropriate OPHELP command: 


#HU - unloop host # 


#HL - loop host # at interface 


#MU - unloop modem # 


#ML - loop modem # at interface


6.2 SWAPPING DOUBLED INTERFACES 


There are two ways to swap to a spare interface for machines with
separate M and I busses, but only the first of these will work with
M/I machines. The first way (works for all types of machines
with double interfaces) is by setting the mask bit of the device to be
killed into the appropriate word in the iokill block. The
starting address of the iokill block is gotten by use of the "KI"
OPHELP command (see Section 3). Note that this is the ONLY way to
swap on M/I machines, since the program tries very hard to swap back
to an F bus interface if one exists. The second way is to delete
the mask bit of the device to become the spare from the appropriate
word in the useio block (get the starting address of useio by
the "KI" OPHELP command, as explained in Section 3).


6.3 HOST PARAMETER BLOCKS 


Parameter blocks are where the state of each logical interface 
(doubled is one logical interface) is maintained by the program. The 
#HB OPHELP command is used to find the starting address of the 
parameter block # (see Section 3). Interesting entries: (see the 


listing for a complete description) 


001F - (low byte of 001E) Host State.
         0 - up
         1 - ready line down
         2 - tardy
         3 - nonexistent
         4 - IMP software not initialized

000E - Transmit pid (hardware switch settings)

000F - Receive pid

0020 - Interface address (on I/O bus) (if this address doesn't
       look anything like an I/O address, you may have gotten
       into the parameter block of one of the fake hosts or a
       VDH). This is the address of the interface currently being
       used, if it is doubled.

0022 - Spare interface address (on I/O bus). Or 0 if there isn't
       an acceptable spare. [NOTE - SUCCESSFULLY SWAPPING
       INTERFACES CAUSES THE PREVIOUS SPARE TO APPEAR AT 0020,
       AND THE PREVIOUS MAIN TO BE AT 0022]

003C - Host throughput counters. These are a block of 8
       throughput counters which are sent to the NMC and then zeroed
       every minute. They are:

         3C - internode messages host-to-IMP
         3E - internode messages IMP-to-host
         40 - internode packets host-to-IMP
         42 - internode packets IMP-to-host
         44 - intranode messages host-to-IMP
         46 - intranode messages IMP-to-host
         48 - intranode packets host-to-IMP
         4A - intranode packets IMP-to-host

002C - Host dead subcodes. This has the reason why a host went
       down. Or 0 if the host is up. See 1822 manual.

002E - IMP number for this host. This is of interest if the IMP
       uses multiple IMP numbers.


6.5 MODEM PARAMETER BLOCKS 


The start of the modem parameter block for a given modem is
gotten by the #MB ophelp command. Remember that modem numbers start
with 0, and the IMP adds 1 to the device number set into the
switches of the modem to ensure that this is so. Interesting
locations (displacements into the block):

0002 - line state word. This word contains the line state in the
       left byte and a count in the right byte. The bits in the
       left byte have the following meanings:

        100  master bit: on if we are higher # imp on this line or
             this line is hard down
        200  line is down and in software reset
        400  line up bit
        800  heard a hello
       1000  heard a hello-up
       2000  take line down if this is set
       4000  send a hello
       8000  send routing

       If on a normal line that is up, bits 800, 1000, 4000, 8000
       should be flashing, perhaps imperceptibly. Bits 100 (if
       our imp number is higher than our neighbor's) and 400
       should be solidly on.


The right byte is a counter whose use varies and whose 


value is useful mostly to the guys. 
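
A line state word read out with DDT can be split into its flags and count as in the following Python sketch. It is an illustration only; the bit values are those listed above, and the name decode_line_state and the short flag names are hypothetical.

```python
# Left-byte bits of the line state word, as listed in the text.
LINE_STATE_BITS = {
    0x0100: "master",
    0x0200: "down, in software reset",
    0x0400: "line up",
    0x0800: "heard a hello",
    0x1000: "heard a hello-up",
    0x2000: "take line down",
    0x4000: "send a hello",
    0x8000: "send routing",
}


def decode_line_state(word: int) -> tuple[list[str], int]:
    """Return (names of the bits that are on, right-byte count)."""
    flags = [name for bit, name in LINE_STATE_BITS.items() if word & bit]
    count = word & 0x00FF
    return flags, count
```

A word of 0512, for instance, shows the master and line up bits on with a count of 12 (hex), which is what a quiet moment on a normal up line would look like for the higher numbered IMP.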


0004 - neighbor on logical line in upper half and old neighbor 


in lower half 


0006 - checksum error count. This is the count of
       hardware-detected errors which are presumed to be checksum
       errors. These errors do not directly trigger any kind of
       trap, and may result from any kind of problem between the
       front end of the transmitting modem interface and the
       front end of the receiving interface. These errors are
       seen anytime the modem is reset, such as a line going down
       and then up.

000C - logical modem number (right byte of 000C)

000E - transmit pid (switch setting) (left byte)

000F - receive pid (switch) (the low byte of 000E)

0020 - interface address (on I/O bus). This is the address of
       the interface currently being used if it's a double.

0022 - spare interface address (on I/O bus). Or 0 if there isn't
       any acceptable spare. [NOTE - SUCCESSFULLY SWAPPING
       INTERFACES CAUSES THE PREVIOUS SPARE TO APPEAR AT 0020,
       AND THE PREVIOUS MAIN TO BE AT 0022]


6.6 HOST TESTING 


At the moment, the only way to test a host consists of looping the


host at various places using either internal software loops or 


external looping plugs. The NMC can then run their host testing 


procedure via NMC NU and return results. This of course requires 


the imp to be up and running on the network. 


6.7 MODEM TESTING


Modem testing done at the site is limited to variously looping 
interfaces, cables, or modems and seeing if the imp considers them good
enough to declare up. For lines that are intermittently bad, 
patches can be installed by software guys to count packets or errors 


over periods of time so a performance evaluation can be made. 


7. RUNNING MESSAGE GENERATOR

*** This section requires the OVERRIDE capability. ***


Message generator is a routine which runs as a fake host on the
local IMP and sends data in selectable length messages at
selectable rates to a selected host(s) on a selected IMP. The
starting address of the parameter block is obtained by using location
6,5E66. Entries in the block are as follows: (addresses are
displacements into the block)


00 - length in words. Initialized to be 8. Set to max of 1F7,
     or negative for torture test (see below)

02 - first leader word - don't change unless you know what you
     are doing. Initialized to be 0F00.

04 - second leader word - don't change. Initialized to be 0.

06 - third leader word - destination host #. (Or set to 0FF to
     send to discard. Initialized to be 0FF.)

08 - fourth leader word - destination IMP #. Initialized to be
     ourself.

0A - fifth leader word - Initialized to be 0. Set to 3 for raw
     packets.

0C - sixth leader word - don't touch. Initialized to be 0.

0E - control word. 0 = off, 1 = on. Initialized off. Set to 1
     to run it.

10 - frequency. Set to 0 to go as fast as possible. Set to 1 to
     send a message every 25 msec. Shift 1 left one place to
     halve rep rate. (So setting a 2 in gives 50 msec rep rate,
     4 = 100 msec, 8 = 200 msec...etc.) Initialized to be 1.
     (see Section 11.C.)


If the IMP # is set to be us, no traffic goes out over modem
lines.

"Torture test" - The torture test is a special case of message
generator and sends messages of standard lengths to four
specified Host-IMP pairs. The four Host-IMP pairs are stored in
table STATDT (6,5E48) and have simply the format Host, IMP, Host, IMP,
etc. The torture test is specified by setting the first word in the
message generator block (length) negative (8000 or greater).
The lengths used are 8, 72, 496 and 504 words for Host-IMP pairs 1
through 4 respectively. The frequency at which a message is sent is
the same as in other message generator stuff.
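
The frequency and length settings can be summarized in a short Python sketch. It is an illustration only. The text gives the rep rate only for 0 and for powers of two (1, 2, 4, 8, ...); extending the 25 msec multiple to other values is an assumption, and the function names are hypothetical.

```python
BASE_PERIOD_MSEC = 25                    # a frequency word of 1 = every 25 msec
TORTURE_LENGTHS = (8, 72, 496, 504)      # words, for Host-IMP pairs 1 through 4


def rep_period_msec(frequency_word: int):
    """Return the time between messages, or None for 'as fast as possible'."""
    if frequency_word == 0:
        return None
    # Assumption: the period scales linearly, as in the 2/4/8 examples.
    return BASE_PERIOD_MSEC * frequency_word


def is_torture_test(length_word: int) -> bool:
    """A 16-bit length word of 8000 or greater is negative: torture test."""
    return length_word >= 0x8000
```

So a frequency word of 8 gives a message every 200 msec, and a length word of 8000 selects the torture test regardless of its low-order bits.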


8. UNDERSTANDING THE STAGE SYSTEM 


8.1 A BRIEF DESCRIPTION


The stage system is a basic chunk of software upon which the
Pluribus IMP system is built. The stage system can be thought of as
fulfilling two purposes: in one sense it is an initialization module
that passes configuration data to the IMP, eliminating the need for
configuration data to be loaded into each machine; in addition,
stage maintains a watch on the hardware and software in use, thus
acting as a reliability module for the IMP. In the initialization
sense, STAGE is the first module to run when an IMP is (re)started.
The STAGE software is broken into nine sequential modules, or
stages, each responsible for determining the usability of some set
of hardware or software. The stages are run on a round-robin
basis, with the requirement that no stage can run unless all the
preceding stages are also running. Each stage runs and, if
successful, it enables the next stage. If any stage fails to come to an
acceptable conclusion, it disables all the stages following it and
reruns until it is happy with the system. When all the stages have
been enabled, the IMP system is started and uses the configuration
information provided by stage to build machine dependent tables,
parameter blocks, etc. While the IMP system is running, the
stage system is also running all its stages at background level.
Changes in the system, such as the loss/gain of memory or I/O
devices, are detected and subsequently can be fixed by this mechanism.

In fixing something, a processor does not just change
configuration information arbitrarily. One processor may see
things differently from others on the machine, and so in order to have
a coherent system, the processors in the system must decide in unison
before making a significant change. The implementation of this
consensus mechanism is simple: each stage has a consensus word on a
common communication page, and each processor wanting to change
something adds (IORs) its bit to that word. The consensus concept
then allows a processor to change something if, by adding its bit,
the consensus word matches the word containing bits for all the
processors known to exist. In this way, any adjustments are done
only by the last processor to join the consensus and therefore with
the approval of the other processors.
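
The consensus rule described above can be sketched in a few lines of Python. This is a single-threaded illustration of the idea, not the STAGE implementation; it ignores the timing and memory-access details of a real multiprocessor, and the names join_consensus and simulate are hypothetical.

```python
def join_consensus(consensus_word: int, my_bit: int, known_processors: int):
    """IOR this processor's bit into the word; report whether it may act.

    Returns (updated consensus word, True if this processor completed
    the consensus and may therefore make the change).
    """
    updated = consensus_word | my_bit
    return updated, updated == known_processors


def simulate(processor_bits: list[int]) -> list[int]:
    """Let each processor join in turn; return the bits of those allowed to act."""
    known = 0
    for bit in processor_bits:
        known |= bit                 # word with bits for all known processors
    word, allowed = 0, []
    for bit in processor_bits:
        word, may_change = join_consensus(word, bit, known)
        if may_change:
            allowed.append(bit)
    return allowed
```

With four processors holding bits 1, 2, 4 and 8, only the last one to join (bit 8 in this ordering) is allowed to make the change.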


8.2 USEFUL STAGE VARIABLES

STAGE variables of interest:

Local

clokrt   50 - number of bad RTC reads (answers differ by >300
              micro sec) (zeroed by 2D trap) see iclock below.

quitrt   56 - number of successful quit-retries (got a quit the
              first time, but ok the second) (zeroed by 2C trap)

uillop   78 - last F-illop

ujiffy   8E - location of last program-in-a-loop

uquitd   90 - got unexpected quit trying to look here

uquitp   94 - address of place that did reference

stime    A0 - local copy of system time (sytime)

stim2    A2 - high order time, 27.96min/tick, 51.50tick/day

oldp     A4 - last pid dispatch

myproc   A6 - processor name, this proc (coupler address and odd

procbt   A8 - processor bit (the bit he would blink)

procno   AA - processor number, this proc

maprel   B0 - map for RELY page

mapddt   B2 - map for DDT page

mapcod   B4 - map for WARM page

mapfak   B6 - map for FAKE page

mapvar   D0 - map for VARS page

mapv2    D2 - map for V2 page

wstage   188 - what stage running

wdis     18C - stage control word - bits on disable stages.
               (same as address lights). A way to tell where a proc is
               hung in stage.

iclock   1A4 - address of the RTC this processor is using for
               timing various STAGE things.


8.3 STAGE DIAGNOSTICS - what a processor is complaining about

If a processor is stuck in STAGE, the address lights (for the
processors connected to that bus) or wdis will tell what stage the
processor is stuck in. The last bit number off to the left is the
highest stage number which has been entered (and is the one we are
stuck in).

Currently the stages are:

0 - LK - Local memory Kernel checksum
1 - MD - common Memory Discovery
2 - RK - Reliability page Kernel checksum
3 - BD - common Bus Discovery
4 - CD - processor Coupler Discovery
5 - RC - Reliability page Code checksum
6 - LC - Local memory Code checksum
7 - MM - common Memory Map management
8 - ID - I/O device Discovery
9 - AR - Application Reliability and initialization

The operational system


Possible causes of being stuck in various stages:

0 - LK - A bad local kernel checksum here will cause a halt. We
may also not be able to see any RTC (or maybe no I/O busses).

1 - MD - We will hang here if we see less memory than the system
(but not if we see more). We may also have a bad common
memory pointer.

2 - RK - We were not able to find a common Kernel, or had a
different idea about it than the system.

3 - BD - We could hang here by getting a quit from the VARS page
(for instance because of a memory coupler failure, or actual
memory failure). It could also be that the bus discovery
answer is changing, or that we have an I/O coupler failure of
some sort.

4 - CD - We could hang here if our coupler tables are bad
(compared to the system, of course), or because BBC is broken
to one processor. If the prohit bit for us is set, we will
come to this stage and halt.

5 - RC - Hanging here says we need a reload because the RELY page
checksum is smashed. (RELY is in common, and usually on the
lower numbered memory bus, if we have both busses and have
had time to stabilize.)

6 - LC - This says local is smashed, and we need a reload
(reload the system, if no other processors have good locals).
Hangs in both stages 5 and 6 are characterized by a number
displayed in the data lights (for the console processors)
which cannot be cleared.

7 - MM - We hang here because of a broken common memory code
checksum, waiting for a reload, or we can hang here if the
cmap table is bad, or if the typeword of a page changed, or
if we want to reload common. Previous comment about data
lights applies here also.

8 - ID - We can hang here because our useio table is bad (we can
see less I/O than others). We will also hang here if our
mask bit is set in prohng.

9 - AR - We are here because we want to initialize and are
waiting for enough processors to run the system (currently
two for IMPs).


9 TRAPS


9.1 TRAPS - GENERAL 


Traps report unexpected and expected error conditions to the NMC and
the local terminal. Each trap consists of a trap identification
number, a word containing the bit(s) of the processor(s) reporting
the trap, a count of how many were reported, then registers 1-7 at
the occurrence of the first trap of this kind. The trap reporting
mechanism in the IMP saves up to eight different traps in a table on
the first variables page.

The terminal display is taken from these tables and is displayed at
the top of the screen. There are two modes to the screen display: if
the screen cursor is at the top left (very beginning) of the screen,
then up to 8 traps will be displayed. If the cursor is anywhere else,
then only 3 traps are displayed, allowing more room for data printout
while debugging. Generally a site terminal should be left in the
first state, to allow maximum information to be on the screen. In
case of IMP failure, site personnel can report the traps to the NMC.
(To leave the cursor at the top left, either use CR to get to the
beginning or more simply type a control-L to DDT, which will re-write
the screen and leave the cursor in the proper place.)

The processor mask is the logical OR of all processors reporting the
trap, with each processor encoded as a bit position, lowest numbered
processor = bit 0.

The actual mechanism for implementing traps in the program is through
the illop (illegal operation) self-interrupt in the processor. Normal
traps are caused by executing an illop in the code of the form Exxx
and Fxxx.

Traps sent to the NMC are in the format:

TRAP# PROC MASK COUNT REG1 REG2 REG3 REG4 REG5 REG6 REG7

Traps sent to the terminal screen are in the format:

TRAP# PROC# COUNT REG1 REG2 REG3 REG4 REG5 REG6 REG7
TRAP# PROC# COUNT REG1 REG2 REG3 REG4 REG5 REG6 REG7
TRAP# PROC# COUNT REG1 REG2 REG3 REG4 REG5 REG6 REG7

The seven registers reported to the NMC and the terminal screen
give lots of information about the cause of the trap, e.g. device
number or line #. The trap description tells which registers are
useful.
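
As an illustration of the processor mask in a trap report, the
following Python sketch builds and decodes such a mask. It assumes that
processor number n corresponds to bit n, which the text implies but
does not state outright; the function names are hypothetical.

```python
def build_proc_mask(reporting_procs):
    """Logical OR of one bit per reporting processor (assumed: proc n = bit n)."""
    mask = 0
    for proc in reporting_procs:
        mask |= 1 << proc
    return mask


def procs_in_mask(proc_mask):
    """List the processor numbers whose bits are set in a PROC MASK word."""
    return [bit for bit in range(proc_mask.bit_length())
            if proc_mask & (1 << bit)]


mask = build_proc_mask([0, 3])   # processors 0 and 3 reported the trap
reporters = procs_in_mask(mask)
```

Under that assumption, a PROC MASK of 9 (hex) would mean that
processors 0 and 3 both reported the trap.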


9.2 A GUIDE TO THE TRAPS


The following are descriptions of some of the more serious
(especially from a hardware point of view) of the Pluribus traps. A
complete list of the traps, together with their hex locations, is
given in section 9.3.


HEX TRAP #    DESCRIPTION OF TRAP


1 unexpected quit

Any quit trap represents a quit which was retried once and
was solid. To find where the quit came from, use DDT to look
at the snapshot area (see above).

--STAGE RESTART


2 program in a loop

Means that the processor didn’t get through LOOP by the time
it was timed out (some strip ran too long). We checked to
see if we were hung on a lock and weren’t. This trap will work
in STAGE too.

Timeout varies depending on what strip was running (range is
200 msec to 1500 msec for timeout). Saves: (UJIFFY) E4 - p.c. at
the time strip was timed out (at jiffy time)

--CAUSES STAGE RESTART


3 Completed Memory Management

This trap occurs in Stage MM if any page swapping occurred.
This is expected on startup or restart, but if it persists it
implies that the IMP is constantly changing memory usage for
some reason.

--NO STAGE RESTART


local clock stopped

This says we got two successive jiffies and the reading on
the RTC hasn’t changed. The response is to pick a new RTC.
For snapshot traps: R2 - has the clock we are now using (the
one we just switched to)

--FULL STAGE RESTART


5 Local Kernel Checksum Broken

This says the most sacred part of local STAGE is busted.
Since the disease is quite fatal, a processor with this
trouble may not be able to report it before halting. If the
checksum breaks while the processor is running he should be
able to trap and then halt. To tell what the trouble is, put
the proc bit in PROKIL (so as not to restart him), turn off
his bus reset timer and verify his local memory (by comparing
with some other processor).

--PROC WILL TRAP AND HALT, OR MAYBE JUST HALT


unexpected interrupt

This says we either got a level 2 or level 3 interrupt, or a
level 1 which wasn’t a remote power fail. This can happen
because of a "reset-attn"


BBC MAP FAILURE

This means that a processor can’t agree with the rest of the
system about what common memory exists, and also can’t get
the right to fix it (because the consensus doesn’t agree with
him). If a memory bus goes down in an n-processor system,
(n-1) processors will report this trap. The nth processor
makes the consensus ok and the memory table gets fixed, since
the algorithm is that the processor who sees the most memory
is right. This trap will also occur for things like bad bus
couplers.

The bus in question will almost always be the 4000 bus. If
only the 0 mem bus is on, the trap will not occur except
under strange circumstances. This is because the memory
table is kept in the lowest numbered memory (communications
page), and if that page breaks, different traps occur.

--NO STAGE RESET


No PIDs in system

Means no PID cards answering (either really gone, or
configured out). The trap happens late enough that we have
already found an RTC somewhere. (Only one PID gone will
result in the devices on that bus being configured out. If
this is suspected, use OPHELP commands to see what devices
are thought to be present.)

--FULL STAGE RESTART


adjusted comrel

This says we moved a page of common on which we were running
the higher numbered stages, and is likely when we get a
broken checksum in either Kernel or the page which was moved.
This will happen continuously when the Rely kernel is broken.

--USUALLY CAUSES IMP RESTART


Jiffy clock stopped

This should only be reported by even processors and says that
at least 1 sec. has elapsed without the jiffy interrupt
having updated its copy of the RTC reading. It could also be
due to the RTC or something in between, since the trap will
be sent if the current RTC reading differs from the jiffy
reading by more than 10000 decimal (1 second). (The jiffy
reads the RTC every 1/60 sec.)

-- [NO ACTION]
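
A minimal Python sketch of the comparison described for this trap
follows. The 10000-count limit comes from the text; the function name is
hypothetical, and the sketch ignores any wraparound of the RTC counter,
since the counter width is not given here.

```python
JIFFY_STALE_LIMIT = 10000  # RTC counts, decimal; stated in the text as 1 second


def jiffy_clock_stopped(rtc_now, jiffy_copy):
    """True when the jiffy's saved RTC reading is more than 1 second stale.

    Assumption: both values are plain counts with no wraparound handling.
    """
    return abs(rtc_now - jiffy_copy) > JIFFY_STALE_LIMIT
```

Since 10000 counts make up one second, each count is 100 microseconds,
and a healthy jiffy (every 1/60 sec) refreshes its copy roughly every
167 counts.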


system missed a tick

This says that the system failed to update the SYTIME word in
common since the last time the reporting processor ran stage
(128 msec).

The cause is not immediately obvious. Could be that we
stopped getting pids from the RTC, or have a broken memory.

--ACTION = COUNTS SYTIME BY 1


Quit in checksum parameters

Interrupt received from the system during checksum routine.


D Quit during checksumming

D traps may accompany 3 traps, but a D trap will not always
cause a 3 trap, since the quit might be in some other code
during checksumming. If snapshots are turned on:

R1 - map if quit in common memory
R6 - loc+2 of the quit

All references in common mem will be through map 1.

--NO STAGE RESTART


Local code checksum broken

This is similar to a 5 trap but for the code page in local.
Since it happens later in stage, it implies that the Kernel
checksum is ok. (Local checksumming covers all
constants in local except for the Kernel.) If other
processors think their local is ok, they will start a block
transfer (software) to the unhappy one, and he will
participate in it.

If all processors are unhappy with themselves, they will ask
for a reload from the net.

(D traps may also occur with this one)

--TOTAL RELOAD OF LOCAL


Not enough memory to run system

This can happen as a transient if MEMKIL is used and the new
usable memory is too small. There is a remote possibility of
1 processor not seeing enough memory but still being able to
run a late enough stage to produce the trap. The picture in
local of what the processor sees in common:

[MYSEGS]   1st word   0-1E00
(block)    2nd word   2000-3E00
           3rd word   4000-5E00
           4th word   6000-7E00

(This picture is bit coded by page recognized within the
specified range. Thus for the first word, bit 1 on means the
200 page on 0 memory. If bit 1 of the 3rd word were on, it
would correspond to the 4200 page on the 4000 memory. Bits
are cleared periodically and then reappear as the processor
discovers memory pages.)
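
The bit coding of the MYSEGS block can be worked through with a short
Python sketch. It assumes 16-bit words, that bit n of a word has the
value 1 << n, and that pages are 200 (hex) apart, which is what the
ranges and the two examples above imply; the function name is
hypothetical.

```python
PAGE_SIZE = 0x200       # spacing between page addresses, from the examples
PAGES_PER_WORD = 16     # assumed 16-bit words: 0-1E00 is 16 pages


def mysegs_pages(words):
    """Return the page addresses marked as seen in a 4-word MYSEGS block."""
    pages = []
    for word_index, word in enumerate(words):
        base = word_index * PAGES_PER_WORD * PAGE_SIZE
        for bit in range(PAGES_PER_WORD):
            if word & (1 << bit):
                pages.append(base + bit * PAGE_SIZE)
    return pages


# Bit 1 of the 1st word and bit 1 of the 3rd word, as in the text.
seen = mysegs_pages([0b10, 0, 0b10, 0])
```

With those inputs the sketch yields pages 200 and 4200 (hex), matching
the two cases given in the description.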


Stage variables area quit

This trap results from the next procedure after finding the
Kernel variables area, which is to scan the whole area for
quits and clear the whole area if we get one. The procedure
assumes it is a parity quit. Consensus is necessary, so a
number of these traps will occur for multiprocessor
operation. The program attempts to use the same area again,
and a quit, while clearing the area, will cause the processor
getting the quit to voluntarily stop using the page via
MEMKIL.

--ACTION: WHEN CONSENSUS IS ACHIEVED, STAGE VARIABLES ARE
REINITIALIZED, PRODUCING A 12 TRAP


lost our communications page

This means that the page we were previously using for
communications gave a quit on a write. The sequence
preceding the trap is that we got a quit on a read of SYTIME
during checking of the page, and then tried a write into the
same location which also got a quit. (The assumption being
that it was a parity quit the first time.) If the quit was
fixed on a rewrite, stage proceeds normally and doesn’t give
any indication.

The normal cause of this trap is 0 Memory disappearing (but
not if configured out).

--FULL STAGE RESTART


12 Stage Common Reinitialization

This says that the timer which says whether stage variables
are current got too old (a watchdog timer), and can happen as
a result of a 10 trap. Timeout time = 10-15 sec. Usually
only triggered by a cold start. Timer is:

STGIM - 0,5D56

--ACTION: PARTIAL STAGE RESET (already done by the time this
trap produced)


Memory coupler quit

Coupler discovery got a quit referencing a memory bus coupler
(BCM). The most likely cause is an old-style BCM on a parity
memory bus.

--THE COUPLER IS IGNORED


14 HUNG ON BAD LOCK

This happens when the jiffy code sees that it’s time to do a
2 trap. Before trapping, it checks for a lock pattern in the
code, and if it finds one, it calculates the address of the
lock and checks for validity. To be valid, a lock must have
either a map 1 or a map 3 address range. An invalid address
range causes a 14 trap.

--FULL STAGE RESTART


CAN’T FIND A CLOCK

System is unable to find the real time clock.

R3 - address of real time clock
R7 - bus used

--FULL STAGE RESTART


17


18 


19 


1A 


remote power fail interrupt

Getting one of these causes us (even proc) to first try to
stop our buddy, then wait 1 sec and try to restart him at the
beginning of stage, then restart us at the beginning of
stage.

--FULL STAGE RESTART


Can’t find an RTC

This happens in the routine which looks for an RTC and is
started if the main RTC stops, or if our previous idea of
where an RTC might be was wrong (for instance if we are told
not to use an RTC on start-up).

--FULL STAGE RESTART


BBC processor started

This is reported by the processor doing the restarting. To
find out who was restarted, check the snapshot: R2 - address
of the coupler of the restarted processor. If R2 is odd,
then the odd processor was the one who was restarted.

--RESTART OF THE TARGET PROCESSOR


Buddy Processor started

This reports that this processor was successful in restarting
the other processor bus.


Bad processor identity 


1C 


1D 


This says that we found our own coupler address and had the
wrong idea about who we were before. The algorithm for
finding ourselves is to see if a given coupler address
answers ‘2100’ on all busses. We then try writing the same
password as we last wrote to that coupler in I/O space, and
if it quits, it’s us. This trap would also happen if we
never discovered our couplers.

--FULL STAGE RESTART


Block Transfer Timeout

This says that the job of BLT is not progressing fast enough.
One possible cause is the system believing a processor is ok
enough to participate in his own BLT of local (to him) when
he really isn’t running.

--FREES BLT FOR NEW USE


BLT proc not in table

This says that we can’t find the BLT target processor in the
table of coupler addresses cleaned in an earlier stage. Not
finding it stops the BLT and causes the trap. No normal IMP
activity is likely to trigger this currently, but things like
DDT or reloading might.

--STOPS BLT PROCESS (NO STAGE RESET)


Non-existent proc in BLT??

This can’t really happen for hardware reasons. Getting it
implies a program bug.

--STOPS BLT PROCESS


1E No I/O bus for BBC

Before doing BBC, we test the path a little by doing a store
through the BBC window. If we get a quit (20 trap), we retry
through the other I/O bus, and if that doesn’t work either,
we give a 1E trap. This trap implies that we can at least
read the processor’s control register (since we had already
discovered him). We try to halt the processor before BBC,
and a quit from the write to his control register will also
cause this trap in the same manner.

--STOPS BLT IN PROGRESS (may change this later)


1F One of my couplers is broken

This occurs in the processor discovery code and says that we
got a quit trying to read a coupler whose name matches our
name. We can probably run okay, though the trap will
continue to happen from time to time.

Snapshot:

R2 - Proc # we were looking for.
R7 - address of coupler.

--KEEPS TRYING, BUT PROBABLY WON’T JOIN SYSTEM


20 BBC transfer failure

22

23

24

This is caused by a quit during either a read or a write
through a BBC window.

Snaps:

R1 - BBC map
R3 - value stored
R6 - I/O bus plus index into BBC window (0-6)
R7 - I/O bus BBC was done through

--TRY TO USE OTHER I/O BUS. IF ALL FAIL, GIVE A 1E TRAP.


Power restore interrupt

We (the even guy) got a level 4 dev 2 interrupt.

--HALT (US AND OUR BUDDY)


Local power fail interrupt

The processor detected that it was restarted after power
was off.

--FULL STAGE RESTART


Illegal level 4 interrupt

We got a level 4 interrupt which wasn’t dev 1, 2 or 4.

--IGNORE


Stage variables memory failure

Interrupt received from system while clearing common memory
variables.

--CAUSES STAGE RESTART


Spare code page’s checksum disagrees

This is not a hardware trap. Primarily it is a reminder to
someone patching the code page that maybe he hasn’t patched
both copies. (Trap will occur about once every 1.4x (# of
pages in use) sec. in each processor.)

--NO ACTION


26 Fixed bad memory parity

The stage (stage MC) which checksums common memory also does
a loop to read the rest of common (outside the checksummed
area). A quit during this test causes us to try to write 0
to the location, and if successful, will cause this trap and
also zero the cell on the page which may tell the system to
reinitialize if appropriate. The trap may trigger various
kinds of restarts.

For snapshot:

R1 - page (map setting)
R6 - location of quit (always through map1)

--MAY REINITIALIZE. MAY TRIGGER VARIOUS RESTARTS


27 Solid memory parity error

This results from the same loop as the 26 trap above, but
says that we got a quit and couldn’t fix the location (either
we got a quit on write, or got a quit when we read the loc
again, or failed to reread the 0 we wrote there).

Snapshot: (same as above)

R1 - page (map setting)
R6 - location of quit (always through map1)

The processor getting the trap stops using the page and does
a full stage restart (and may drop out of consensus if he is
the only one seeing the problems). This process is intended
to help remove processors writing bad parity from the system,
but may have trouble doing that unless the bad parity writes
are relatively frequent or solid. If all processors see the
failure, then all will stop using the bad memory.

--STAGE RESTART


No useable common memory

This says we haven’t found the communication page (the lowest
numbered page) on any memory bus. It could happen with 2 bad
BCMs, or as a result of some error in switching cables,
cards, etc., or if the memory went away.

--HANGS IN STAGE 1 (if no other proc is running and holding
off a timer, we will reinitialize and retry about every 1-3
minutes if we see any memory at all.)


2A Quit in quit handler

Entering the quit handler for the first time sets a flag. If
the flag is already set, we make a 2A trap. This trap can
happen as the result of a software bug if the entry flag is
initialized the wrong way when a processor is restarted after
being stopped.

--CAUSES STAGE RESTART


2B Quit on instruction fetch

This happens in the same loop as the 2A trap. If the quit
target is within 4 bytes of the saved quit P.C., we call it
a 2B trap.

--CAUSES STAGE RESTART


2C Quit retry(ies) succeeded

This trap occurs if we got a quit which succeeded on a second
try at the instruction. This seems to happen a lot, and is
indicative of some sort of design problem. Unfortunately, we
can’t notice where we retried in general, though a special
patch might be able to do it.

Snaps:

R1 - number of retries

--Since retry worked, program continued normally


2D RTC read retry(ies) succeeded

This trap occurs if we got inconsistent readings on two
successive reads of the real-time clock. The readings are
allowed to differ by 3 (300 microseconds). If they differ by
more, we count a counter, which is then reported later by
this trap. This trap probably indicates some sort of design
bug in the hardware.

Snaps:

R1 - number of retries

--Just retries until it succeeds
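
The double-read check behind this trap can be sketched in Python as
follows. The limit of 3 counts comes from the text; the function name,
the callable used to stand in for the hardware read, and the bound on
the number of attempts are assumptions made so that the sketch always
terminates (the real code is described as retrying until it succeeds).

```python
MAX_RTC_DIFF = 3  # counts; stated in the text as 300 microseconds


def read_rtc_checked(read_rtc, max_tries=10):
    """Read the RTC twice and accept the pair only if the readings agree.

    read_rtc is a hypothetical callable returning one clock reading.
    Returns (reading, bad_read_count). max_tries is an assumption of
    this sketch, not part of the described behavior.
    """
    bad_reads = 0
    for _ in range(max_tries):
        first = read_rtc()
        second = read_rtc()
        if abs(second - first) <= MAX_RTC_DIFF:
            return second, bad_reads
        bad_reads += 1  # the count that is later reported by the trap
    raise RuntimeError("RTC readings never agreed")


# Simulated readings: the first pair disagrees, the second pair agrees.
readings = iter([100, 250, 300, 302])
value, bad = read_rtc_checked(lambda: next(readings))
```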


2F Copy/Cleared Failed

Unable to copy/clear address located in register 4 of TRAP.


40 Start pointer write failed

Bad start pointer given to routine to start the I/O input
routine.

R1 - address in error
R3 - length of message/packet
R5 - device page used


41 4 start pointer failures

System software made more than 4 attempts to write the I/O
buffer to the address specified in R1.

R1 - address failing
R3 - length of the message/packet
R5 - device


42 got illegal pid value

Indicates that the routine already exists or that that area
in the BASE table is in use.

R1 - pid level
R2 - address in BASE table


43 map error

System software received error while verifying the maps for the
software.


44 LSTACK overflow

Pointer to addresses in maps exceeded limits.


45 INBASE failed

Error received while attempting to initialize the BASE table.


50 software watchdog timer expired

Timer exceeded limit. Generates an interrupt to keep software
from looping.


C0 tty changed psb

TTY initialize routine either restarted or did a reset due
to quits received by the systems software.

R4 - psb


100 IMP reinitialization

This means what it says. Nothing much about the previous
state of the machine remains after the reinitialization, but
some clues may be gotten from previous traps, if they are
still around.

--FULL REINITIALIZATION


101 smashed a buffer pointer

The table of buffer pointers (starting address and number of
bytes, etc.) has had an entry smashed. The buffer (or packet)
is lost.


102 Changing buffer page allocation

Reported as a result of the Node reconfiguring after a device
failure in a M/I bus machine. System is now using spare device.
Notify maintenance.


108 Main clock has stopped

This says that the main RTC didn’t tick for three ticks (25
msec ticks) of the backup clock. It also says that the other
clock became the main clock. The F clock is always preferred
for the main clock if it exists. CLOCK - 6238 - address of
current main clock.

--STARTS USING OTHER RTC (has already started by trap time)


109 Backup clock working again

This trap says that we had one clock (either one) and now
have two. If the F clock was the one that came back, the
system starts using it as main clock instantly.


10A No working backup RTC

This trap happens when the backup RTC goes away for any
reason. The trap will reoccur every 27 minutes until the
backup reappears.


10B IMP number invalid

This says that the IMP number set into the switch register on
either or both RTCs is zero or too big ( >67). The most
likely cause is that somebody forgot to set it when the RTCs
were last installed, but it could also be a flakey switch or
associated logic on the RTC. We don’t currently check for
different IMP numbers in the two RTCs (unless one is zero)
and will always use the IMP number on the EOOG RTC when two
are present. (This aspect will be changed.)

--THE IMP WILL COME UP (NOT TO THE NET) BUT WON’T RUN WELL
(DDT WILL BE INACCESSIBLE FROM THE terminal) AND WE WILL KEEP
TRAPPING.


10C RTC gone away

This says that we ran through the bus discovery code in stage
and removed the dispatch for one RTC from the table (as a
result of not finding it). This trap will not occur if both
RTCs have the same pid settings (having them be the same is
now ok, but not such a hot idea). This trap may occur along
with a 10A or 108 trap.

--NO ACTION (will cause KI trap)


110 Modem hardware gone away

(FOR 110, 111, 112, 113, 114, 115 TRAPS:) These traps are all
the result of the IMP reconfiguring (the table changes in
STAGE, but doesn’t trap then, and this is the result of the
IMP noticing the change). (We will only give one of the
three types of traps for each interface unless two halves of
an interface disappear at the same time.)

Snapshots for trap 110:

R1 - address of device gone away

--STOPS TRYING TO USE INTERFACE (may also cause 204 trap)


Host HARDWARE gone away

SEE ALSO TRAP 110

Snaps:

R1 - address of device gone away

--STOPS TRYING TO USE INTERFACE (may also cause 204 trap)


Spare modem interface disappeared

SEE ALSO TRAP 110

Snap:

R1 - address of device gone away

--NO ACTION


Swapping modem interfaces

SEE ALSO TRAP 110

R1 - new interface we are using

--SWAP


Swapping host interface

SEE ALSO TRAP 110

Snaps:

R1 - address of guy who went away
R3 - address of new guy we are using

--SWAP


203 Over 2 interfaces, one device

This happens if we find at least three devices with the same
device type and device number.

Snaps:

R5 - device we’re trying to put in the dispatch
R3 - 2nd of the 2 devices we’ve already found with the same
type and number.

--USE THE FIRST TWO INTERFACES FOUND


204 Removed bad base dispatch

This results from a 10C, 110, or 111 trap, generally.

--CLEARS ENTRY IN BASE DISPATCH


205 Pid for doubled interface differs

This says we found two devices with the same type and device
numbers but differing pid settings. This could happen for a
variety of hardware reasons (flakey pid and device no
switches or operator error are the most common).

Snaps:

R5 - address of second device to be found
R4 - parameter block for first device
R3 - receive pid for second device
R6 - transmit pid for second device

--USE THE FIRST OF THE TWO INTERFACES DISCOVERED


207 Dynamic blocks area full

This says there are too many devices for us to handle. The
maximum number varies from assy to assy, and is currently 40
devices (single or doubled).

--SOME DEVICES WON’T GET USED (the trap will keep occurring)


301 Detected VHA table error.

The system has detected that either a virtual host
entry is not assigned to a physical host, or that
more than one virtual host number is assigned to the
same physical host address.

Snaps:

R3 - Contains VHA (in HEX) which is being
complained about.

--HOST TRAFFIC MAY BE LOST - TABLE MUST BE CORRECTED


302 INVHA: No Virtual address found.

The system has detected that no physical address exists
in the VHA table for a virtual host being referenced.

Snaps:

R3 - should contain the VHA number.

Note: Can also be caused by PLULOG ON with no dest. host to go to.


303 TSKVHA: No Virtual Address this source.

The system has detected that it has received host traffic
from a physical host port that has no virtual address assigned
to it.

Snaps:

R3 - Contains the physical IMP number.
R5 - Contains the physical host port number.


304 Too many VHA numbers.

The system has detected a virtual host address that exceeds
the table length. It is treated as being invalid.

--EXAMINE VHA TABLE AND MAKE APPROPRIATE CHANGES


305 VHA IMP Number too big.

The system has detected a virtual host physical address
that has an IMP number greater than current maximum.

--EXAMINE VHA TABLE AND MAKE APPROPRIATE CHANGES


3C0 IMP going down

This trap occurs when we enter the nice-stop sequence after
either a ANS or #NR command.


3C1 Got setup for no dest

This trap says one of our neighbors is asking to be reloaded.


3C2 Flushing reload packet - No room

This trap says we lost a reload packet we were trying to send
to a dead neighbor.


401 Modem bad end pointer

This applies to the receive end pointer only (so far). It
says that we checked the receive end pointer after we got a
receive pid and it was either less than the beginning of the
buffer we were using, or past where we told it to stop.
Checked after every modem input.

Snap:

R4 - address of parameter block
R6 - address of the buffer (through MAP2)
(map in MAPSAV area)
R1 - length that the hardware said it gave us.

--FLUSH THE INPUT AND RETRY. RESETS THE INTERFACE


402 Modem got a quit

This applies to receive only. (We don’t currently get much
information out of this one.)

Snaps:

R4 - address of parameter block.
R5 - device address
R1 - end of input that the hardware told us.
(receive end pointer)
R2 - contents of status register

--FLUSH INPUT AND RETRY. RESETS INTERFACE.


403 Modem input too short

This says the packet was less than the minimum size that the
net should ever give us (0A bytes).


A2EA (VARS/Vars) clklok: Lock on RTC counters
A3B6 (VARS/Vars) ita: task queue lock
A3C4 (VARS/Vars) free: free buffer list
A3C6 (VARS/Vars) freend: end of free buffer list
A3C8 (VARS/Vars) nf: size of shared buffer pool plus minf
A4A4 (VARS/Vars) lockro: routing send buffers lock
A4A6 (VARS/Vars) cycle: timeout clock counters
A4A8 (VARS/Vars) trnlok: free transaction blocks lock
A4AA (VARS/Vars) messt: message number timeout non-lock
A4B2 (VARS/Vars) ringlk: restarter ring lock
A4D4 (VARS/Vars) tcgo: host wakeup lock
A4D8 (VARS/Vars) tbkgo: back host wakeup lock
A506 (VARS/Vars) stolok: slow timeout lock
A550 (VARS/Vars) conlok: configuration lock
A5B0 (VARS/Vars) rmlock: (and every 020) rcv mes block locks
A930 (VARS/Vars) tmlock: (and every 020) xmit mes block locks
ACB0 (VARS/Vars) reas blk lock (and every Hi0)
AE44 (VARS/Vars) Fake 0 DOZE lock
AE46 (VARS/Vars) Fake 0 WAIT lock
AEC6 (VARS/Vars) Fake 1 DOZE lock
AEC8 (VARS/Vars) Fake 1 WAIT lock
AF48 (VARS/Vars) Fake 2 DOZE lock
AF4A (VARS/Vars) Fake 2 WAIT lock
AFCA (VARS/Vars) Fake 3 DOZE lock
AFCC (VARS/Vars) Fake 3 WAIT lock
B038 (VARS/Vars) back host 0 (back5) lock
B058 (VARS/Vars) back host 1 (back7) lock
B078 (VARS/Vars) back host 2 (backS) lock
B098 (VARS/Vars) back host 3 (back6) lock
B0C0 (VARS/Vars) hi host lock fake 0
B10E (VARS/Vars) ih hardware lock fake 0
B11C (VARS/Vars) ih software lock fake 0
B130 (VARS/Vars) hi host lock fake 1
B17E (VARS/Vars) ih hardware lock fake 1
B18C (VARS/Vars) ih software lock fake 1
B1A0 (VARS/Vars) hi host lock fake 2
B1EE (VARS/Vars) ih hardware lock fake 2
B1FC (VARS/Vars) ih software lock fake 2
B210 (VARS/Vars) hi host lock fake 3
B25E (VARS/Vars) ih hardware lock fake 3
B26C (VARS/Vars) ih software lock fake 3
B2A4 (DISPLY/Vars) dsplok: display variables lock
B2C6 (ROUTE/Vars) spfrtl: Lock on common SPF tables
B2C8 (ROUTE/Vars) rutlok: Lock on routing processing
B3C6 (VHA/Vars) vhalok: Lock on VHA inverse translation table
BD52 (STAGEK/RelVars) Common Bus Discovery Consensus
BD5E (STAGEK/RelVars) processor and bus coupler discovery consensus
BD68 (STAGEK/RelVars) bbclok: lock on bus coupler states
BDBA (STAGEK/RelVars) bltlok: Block transfer lock
BE2A (STAGEK/RelVars) Consensus for Rely page Checksum
BE30 (STAGEK/RelVars) Consensus for Local Checksum
BE38 (STAGEK/RelVars) memory configuration consensus
BE52 (STAGEK/RelVars) consensus for I/O discovery
BEA6 (STAGEK/RelVars) initialization consensus lock


BEAE (PKCORE/RelVars) pkclok: lock on packet core parameters

645 (WARM/LCode) 3178: Got msg with illegal pkt code 13
646 (WARM/LCode) 30BC: No trnblk for inc RFNM
647 (WARM/DDTCode) 4E74: no trnblk for inc query
650 (WARM/LCode) 305E: Got a duplicate Allocate 1
658 (FAKREL/FakCode) 55A0: Bad local Host in message block
681 (FAKREL/DDTCode) 5480: Flushing an old trnblk
682 (WARM/DDTCode) Be TAs recovered an old reas block
683 (FAKREL/DDTCode) 54A0: requeueing trnblk for IH
684 (FAKREL/DDTCode) BA4CE: trnblk/tmblk mismatch
68A (WARM/LCode) 2B54: host blocked awaiting free buffer
68C (WARM/LCode) 2D18: host blocked awaiting mes num or blk
68D (WARM/LCode) 28B4: host blocked awaiting all8
68E (WARM/LCode) 2AD4: host blocked awaiting task
68F (WARM/LCode) 289C: host blocked requesting all8
690 (WARM/Warm) BAIA: host blocked awaiting trnblk
691 (WARM/LCode) 2952: host blocked middle of 8-pkt
692 (WARM/LCode) 29A2: Task blocked jaechonete message
698 (WARM/Warm) 5DB2: Bad rm blk for mes on host q
699 (WARM/LCode) 2AB0: ovtsk: lost buffer in hi2tsk
69A (WARM/LCode) 2AD0: back blocked awaiting task
6A0 (WARM/LCode) 296C: error during host input data
6C2 (WARM/Warm) 5CD8: bad buffer on host queue
6C6 (WARM/LCode) 28A2: clobbered hisp requesting all8
6C7 (WARM/LCode) 28B8: hisp clobbered in packet
6C8 (WARM/LCode) 29B2: bad hisp for bad message
6CA (WARM/LCode) 305A: bad trnblk buffer
6D0 (WARM/LCode) 3524: bad buffer in t2h


6D8 (WARM/LCode) 2E40: ihn lost a trnblk
6E8 (WARM/Warm) 5CD4: bad ih queue structure
6F0 (WARM/LCode) 2CA2: HI bad packet length
7C0 (WARM/LCode) 3066: Got an Out-of-range
7C1 (WARM/LCode) 2FBE: sending out-of-range
7C2 (WARM/LCode) 31D2: no free rm blk
7C4 (WARM/LCode) 29B6: dest died in hi
7C5 (WARM/LCode) 2FB8A: sending duplicate reply
7C6 (WARM/LCode) 31E6: received duplicate Get-a-block
7C7 (WARM/DDTCode) 4EB4: sending incomplete query
7C8 (WARM/LCode) 2F62: Sending incomplete reply
7CA (WARM/LCode) 3340: no allocate for 1-pkt msg
FC8 (WARM/LCode) 29E8: host sent error with id
FD0 (WARM/LCode) 3G6AE: nal gone negative
FD8 (WARM/LCode) 3494: Illegal rstate/type
FE0 (WARM/LCode) 398C: Back bypassing allocate
FE1 (WARM/LCode) 3988: Back can’t back up

Lock (Source/Page) Label: Description
A082 (STAGEK/Vars) wmlock: memory test lock
AOBSE (STAGEK/Vars) siflk: locked copy (+2) of silfptr
AOS8 (STAGEK/Vars) memory discovery consensus
A0AA (STAGEK/Vars) Common Kernel Discovery Consensus

A25E (DDT/Vars) d2fl:
A260 (DDT/Vars) f2d1:
A262 (DDT/Vars) ttylok:
A264 (DDT/Vars) ddtlok:


406 


407 


4089 


40A 


40B 


40C 


40D 


40E 


413 


414 


415 


420 


421 


4C1


4C8 


500 


503 


504 


505 


506 


507 


508


509 


50A


(MODEM/Warm) 


(ROUTE/Warm) 


(CONFIG/RelCode)


(MODEM/LCode) 


(MODEM/LCode) 


(UPDWN/Warm) 


(MODEM/Warm) 


(UPDWN/Warm) 


(MODEM/LCode ) 


(MODEM/LCode) 


(MODEM/LCode) 


(MODEM/LCode)


(MODEM/LCode ) 


(UPDWN/Warm ) 


(TASK/LCode) 


(TASK/LCode) 


(MODEM/LCode ) 


(FAKREL/DDTCode) 


(ROUTE/Warm) 


(ROUTE/LCode ) 


(ROUTE /Warm) 


(ROUTE/Warm) 


(ROUTE/Warm) 


(ROUTE/Warm) 


(ROUTE/Warm) 


(ROUTE /Warm) 


(ROUTE/Warm) 


47F0:


4C92: 


5AD2: 


21CC: 


1C76: 


4670: 


49C4: 


460C: 


206C: 


1DGE: 


1DA2: 


1DAA: 


2202: 


44F4: 


236C: 


2344: 


1FF8: 


DGGE: 


5348: 


22FC: 


5532: 


5544: 


5562: 


57D2: 


58D4: 


58F6: 


541A: 


IQ2MFLD: 


RUTSPF : 


Bad checksum in routing update 


bad update checksum 


MTEST: scrambled modem parameter block 


bad sentgq 


T2M: lost SNDING buffer 


KNMISS: 


M2IHIHY:


PHDEDL: 


master line died 


Slave obeys master down 


Slave missed k in a row 


modem software checksum failure 


broken cksum on retransmission 


64 retransmissions: killed line 


32 retransmissions: discard packet 


DOAK: unexpected ack 


PHDEDL: 


Excessive Hardware Checksum Errors 


TASK: no route for packet 


TASK: flushing pkt with discard bit


filling buffer error 


modem state mismatch 


SPFERR: 


RUPWHC : 


RUPFLS: 


RUPFLS: 


RUPFLS: 


RTRGEN: 


RUPQCK: 


RUPQCK: 


RUPENQ: 


SPF error forced restart 


routing queue broken 


buffer no longer owned by routing 


caller’s bit not on 


rupq buffer missing 


retransmission with bad length or IMP 


rupgqet wrong 


recovered unused buffer 


queuing packet for no one 


555 (ROUTE/Warm) 545A: CHKRQ: queue count too large 


557 (ROUTE/Warm) 545A: SPF accounting information 

5C2 (MODEM/LCode) 20E2: M2IREG: suddenly looped line 

5C3 (TASK/LCode) 2396: TASK: flushing packet for dead IMP 
5C5 (MODEM/Warm) 49A2: M2IHIHY: master/slave mismatch 

5C6 (MODEM/Warm) 493A: M2IHIHY: Neighbor IMP number changed 
5C8 (MODEM/LCode) 20EE: M2IREG: accepting pkt on dead line 
5D0 (UPDWN/Warm) 4762: LINEUP: line up, r7=neighbor

600 (WARM/LCode) 3278: No message for inecm or incg

602 (WARM/LCode) 2968: host input got a quit 

603 (WARM/LCode) 2AEG6G: host input quit in leader 

604 (LOCAL/LCode) 1800: IH: host output got a quit 

605 (WARM/LCode) 338C: no reas block for allocated 8-pkt msg 
606 (WARM/LCode) 3252: no allocate to give back
607 (WARM/LCode) 326A: incg or inem with gvb, but no alloc to gb
608 (WARM/LCode) SS2ZE% rstate violation
60A (WARM/LCode) 3174: reply lost-no space
60B (WARM/Warm) 5CFC: HSIOUT: Start pointer write failed
611 (WARM/LCode) 2D24: illegal message blk in hi
619 (FAKREL/FakCode) 5662: ihwq is a mess
61A (CONFIG/RelCode) 5986: BLDHST: BASE/MBLKS wrong for HI/IH
628 (CONFIG/RelCode) 5B56: scrambled host parameter block
640 (WARM/LCode) 2FDA: Block error, no recovery
641 (WARM/LCode) 2FE4: Block error, trying recovery
642 (WARM/LCode) 2FF0: No trnblk for allocate
643 (WARM/LCode) 3096: no trnblk for RFNM or dead RFNM


644 (WARM/Warm) 5BB4: res rep when not resetting 


10A (FASTTO/Warm) 43A6: no working backup RTC 


10B (FAKREL/FakCode) 56F2: CONCLK: IMP number invalid 

10C (CONFIG/RelCode) 5846: RTCCHK: RTC gone away
10D (FAKREL/FakCode) 6E1D: Node heat prob/or lost bus
110 (CONFIG/RelCode) 5A98: MTEST: modem hardware gone away
111 (CONFIG/RelCode) SAEQ: HOTEST: host hardware gone away
112 (CONFIG/RelCode) 5B8C: TST2DEV: spare interface disappeared
114 (CONFIG/RelCode) 5AA2: MTEST: swapping modem interfaces
115 (CONFIG/RelCode) 5AEA: HOTEST: swapping host interfaces
203 (CONFIG/RelCode) 5A0A: DEVINUSE: over 2 interfaces, one device
204 (CONFIG/RelCode) 574A: RELCON: removed bad BASE dispatch
205 (CONFIG/RelCode) 5AQ2: DEVINUSE: PID for doubled interface differs
206 (CONFIG/RelCode) 593E: CMODEM: BASE/MBLKS wrong for M2I/I2M
207 (CONFIG/RelCode) SAGE: BLDBLK: dynamic blocks area full

208 (FAKREL/LCode) 3C0E: free list in loop

209 (FAKREL/LCode) SCE: lost the free list 

20A (IMPSUB/LCode) 1458: FREGET: threw away free list tail 

263 (IMPSUB/LCode) 1354: map error in flush 

264 (IMPSUB/LCode) 14A4: map error in nwheom 

266 (IMPSUB/LCode) 156C: * map error in deque 

267 (IMPSUB/LCode) 1530: * map error in unpack 

281 (FAKREL/FakCode) 527E: BUFT: Recovered a timed-out buffer 
2A2 (IMPSUB/LCode) 158C: DEQUE: buffer ownership error 

2C2 (FAKREL/LCode) 3BFE: free list buffer error--WHERE nonzero
2C8 (LOCAL/LCode) 1678: ringc overflow in rstart
2C9 (FAKREL/DDTCode) 550C: ring structure broken in timeout
2E1 (IMPSUB/LCode) 1448: FREGET: free list error, non-zero where
2E3 (IMPSUB/LCode) 1366: tried to flush non-buffer
2E5 (IMPSUB/LCode) 1374: tried to flush non-owned buffer
2F0 (IMPSUB/LCode) 1598: fixed half-empty queue

300 (VHA/DDTCode) 5648: VHAREL: finished VHALIS recomputation 
301 (VHA/DDTCode ) 556C: VHAREL: Detected VHA table error 

302 (VHA/LCode ) 3D40: IHVHA: No virtual address found 

303 (VHA/LCode) 3D6C: TSKVHA: No virtual address this source 
304 (VHA/DDTCode) 5544: VHAREL: Too many VHA numbers 

305 (VHA/DDTCode) 5594: VHAREL: VHA IMP number too big 

3C0 (FAKREL/FakCode) 5BDE: imp going down
3C1 (FAKES/FakCode) 485C: FH2: got setup for no dest
3C2 (FAKES/FakCode) 4814: FH2: Flushing Reload Packet
3F0 (FAKSUB/FakCode) 4124: FDOZEW: Initialized Jam Fake Host
3F1 (FAKSUB/FakCode) 4328: FWAITW: Initialized IMP Fake Host
3F8 (FAKSUB/FakCode) 41B0: JAMLEADER: Host wanted a buffer
3F9 (FAKSUB/FakCode) 41A6: JAMLEADER: no host block
3FA (FAKSUB/FakCode) 43BC: SUCKLEADER: Host sending a buffer
3FB (FAKSUB/FakCode) 4466: FSUCBUF: Host sending leader
3FC (FAKSUB/FakCode) 4282: FJAM1B: Host wanted a leader
3FD (FAKSUB/FakCode) 4278: FJAM1B: No host block?
3FE (FAKSUB/FakCode) 4474: FSUCBUF: No host block?
3FF (FAKSUB/FakCode) 448C: FSUCBUF: Bad buffer from IH


401 (MODEM/LCode) 205A: modem bad end pointer 
402 (MODEM/LCode) 1F7E: modem input got a quit 
403 (MODEM/LCode) 2056: modem input too short 
404 (MODEM/LCode) 1CSA: modem output got quit 


405 (MODEM/LCode) 1F14: I2MXMIT: Start pointer write failed


8 (STAGEK/RelCode) 41EE: No PIDs in system
9 (STAGEK/LCode) D1E: adjusted comrel
A (STAGEC/RelCode) 5580: Jiffy clock stopped
B (STAGEK/LCode) 908: WSLEEP: System missed a tick
C (STAGEK/LCode) ESA: Quit in cksum parameters
D (STAGEK/LCode) EGO: quit during checksumming
E (STAGEC/RelCode) 503E: Stage LC: Local code checksum broken
F (STAGEC/RelCode) 5712: Stage MM: Not Enough Memory
10 (STAGEK/RelCode) 4264: Stage variables area quit
11 (STAGEK/LCode) 8C8: WSLEEP: Lost our communications page
12 (STAGEK/RelCode) 4250: Stage Common Reinitialization
13 (STAGEK/RelCode) 444E: Stage CD: Memory Coupler End QUIT
14 (STAGEK/LCode) 670: hung on bad lock
15 (STAGEK/LCode) ABE: Stage LK: can’t find a clock
16 (STAGEK/LCode) 4BE: remote power fail interrupt
17 (STAGEK/LCode) 58A: Can’t find an RTC
18 (STAGEK/RelCode) 49D4: BBC processor started
19 (STAGEK/RelCode) 45FC: Buddy Processor started
1A (STAGEK/RelCode) 44F0: Stage CD: Bad Processor Identity
1B (STAGEK/RelCode) 4680: Block Transfer Timeout
1C (STAGEK/RelCode) 484A: BLT proc not in table
1D (STAGEK/RelCode) 48DC: Non-existant proc in blt??
1E (STAGEK/RelCode) 4902: No I/O bus for BBC
1F (STAGEK/RelCode) 43F2: Stage CD: One of my Couplers is broken
20 (STAGEK/RelCode) 49EC: BBC transfer failure
21 (STAGEK/LCode) 564: Power restore interrupt
22 (STAGEK/LCode) 55C: Local Power Fail Interrupt


23 (STAGEK/LCode) 56E: illegal level 4 interrupt
24 (STAGEK/RelCode) 4228: Stage variables memory failure
25 (STAGEC/RelCode) 51DE: Stage MM: Spare page checksum differs
26 (STAGEC/RelCode) 5SE6: fixed bad memory parity
27 (STAGEC/RelCode) SEE: solid memory parity error
28 (STAGEK/LCode) B3C: SMD: no useable common memory
2A (STAGEC/RelCode) 5558: QUIT(s) in QUIT handler
2B (STAGEK/LCode) 478: Quit on instruction fetch
2C (STAGEC/RelCode) 5544: Quit retry(ies) succeeded
2D (STAGEC/RelCode) 556A: RTC read retry(ies) succeeded
2F (STAGEC/RelCode) 537A: Stage MM: Copy/clear failed
40 (XSIOIN/LCode) F00: XSIOIN: Start pointer write failed
41 (XSIOIN/LCode) EFS: XSIOIN: 4 start pointer failures
42 (OPSYS/LCode) 106C: got illegal pid value
43 (OPSYS/LCode) 110A: LOOP: map error
44 (OPSYS/LCode) 10C2: LOOP: LSTACK overflow
45 (OPSYS/LCode) 1286: INBASE failed
50 (STAGEC/RelCode) 5658: SARWDG: software watchdog timer expired
C0 (DDT/DDTCode) 416A: TTYINI: tty changed psbs
100 (FAKREL/FakCode) 5AB0: IMP reinitialization
101 (FAKREL/FakCode) B24E: smashed a buffer pointer
102 (CONFIG/RelCode) 57E4: RELCON: Changing buffer page allocation
102 (FAKREL/FakCode) 5SOA: FAKINI: stopped pktcore
103 (CONFIG/RelCode) 5BB7A: TST2DEV: Swapping to F device
104 (STO/LCode) 1B7A: STO: lock timed out
108 (FASTTO/Warm) 4384: main clock has stopped
109 (FASTTO/Warm) A35SE: backup clock working again


Snaps: 


R4 - address of parameter block 


R6 - address of the buffer (through MAP2) 
(map in UMAP area) 


R1 - length that the hardware said it gave us.


--FLUSH INPUT AND RETRY. RESETS INTERFACE 


404 Modem output got a quit 


The modem output status register reported a quit. 
Snaps: 

R3 - output status register we read 

R4 - modem parameter block address 


R5 - device interface address 


--CONTINUES WITH NEXT INPUT


405 Start Pointer Write Failed 
Snaps: 
R4 - start of modem parameter block for the offending
modem (or pair).


--FLUSH INPUT 
410 Modem software checksum failure

The systems software detected a checksum error on a packet
(after the packet apparently was accepted by hardware
checksum logic). This could indicate a software problem, but
most likely is the result of a bad DMA card in the modem
interface.


5C5 Master/Slave Mismatch

Usually caused when the lower-numbered IMP on the other end of
the line fails to hear our "HELLO" packets. The fault could then
be our modem output failing or the other IMP’s modem input failing.


Snaps: 


R4 - Points to the modem parameter block. 


602 Host input got a quit 


Host input detected a quit during the input of data. 


Snaps: 


R1 - receive status


R4 - host parameter block 


R5 - host interface address 


--RETURNS "ERROR DURING DATA" MESSAGE TO HOST 


603 Host input quit in leader 


The hardware reported a quit while we were reading in the 


leader. 


Snaps: 


R2 - receive status 


R4 - host parameter block 


R5 - host interface address 


--RETURNS "ERROR IN LEADER" MESSAGE TO HOST 


604 Host output got a quit

The host output hardware reported that it got a quit. The
host is reset, and the host ready line is flapped to indicate
the failure.

Snaps:

R3 - transmit status

R4 - host parameter block

R5 - host interface address

--FLAPS HOST READY LINE AND RESETS SOFTWARE

6A0 Error during host input data

This says the error bit came on in the host receive end
pointer (bit 0). The usual cause is either the host or the
IMP dropping its ready line while active.

Snaps:

R7 - end pointer we got

R4 - parameter block of host

R5 - interface hardware address

--FLUSHES THE MESSAGE AND GIVES HOST AN "ERROR DURING DATA"
MESSAGE.

FC8 Host sent error with ID

This says that the host computer thinks its ready line (IMP
ready) flapped while it was reading its leader in. Although
this is specified in 1822, it is very unlikely that any real
hosts will do it.


9.3 TRAP LOCATIONS FOR IMP <1200> (NOT NECESSARILY TRUE FOR PSE)


The following is a list of Pluribus traps. For IMPs on the
PLATFORM the hex trap number is reported to the NMC. For machines
not on the net, the traps are displayed at the bottom of the
terminal. When the Pluribus times out a software lock,
its address is reported as if it were a trap. The locks are
listed following the traps. Note - some locks are contained in
dynamically-allocated parameter blocks; thus, their addresses
depend on the individual machine configurations. If you need to
find out what dynamic lock has timed out, ask for help from a
software person.
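
Some statically allocated locks in the lock table are listed as a first
address plus a repeat interval, for example rmlock at A5B0 "(and every 020)".
The sketch below shows one way a reported lock address might be matched back
to such an entry. It is only an illustration: the function name is
hypothetical, the interval is assumed to be hexadecimal, and the end of each
group is assumed to be the address of the next row in the lock table.

```python
from typing import List, Optional, Tuple

# (label, first address, assumed end bound, assumed stride).
# Addresses come from the lock table; reading "020" as hex 0x20 and using the
# next table row as the end bound are assumptions made for this sketch.
STRIDED_LOCKS: List[Tuple[str, int, int, int]] = [
    ("rmlock", 0xA5B0, 0xA930, 0x20),
    ("tmlock", 0xA930, 0xACB0, 0x20),
]


def decode_strided_lock(addr: int) -> Optional[Tuple[str, int]]:
    """Return (label, block index) if addr lands on a strided lock, else None."""
    for label, base, end, stride in STRIDED_LOCKS:
        if base <= addr < end and (addr - base) % stride == 0:
            return label, (addr - base) // stride
    return None


if __name__ == "__main__":
    for reported in (0xA5B0, 0xA5F0, 0xA930, 0xA5B2):
        print(f"{reported:04X}", decode_strided_lock(reported))
```

An address that matches no entry (the function returns None) may be an
ordinary trap number, a singly listed lock, or a dynamically allocated lock,
which this sketch cannot resolve.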


The names in parentheses below are the source file name and the
logical page. The logical page information is used for obtaining
the correct common memory page if you need to patch a trap for
any reason. (See Section 2.1.)
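
Each row of the trap list has the form "Trap (Source/Page) Loc: Description",
with the trap number and location in hexadecimal. As a minimal sketch, a row
of that shape could be split into its fields as follows. The names TrapEntry
and parse_trap_row are hypothetical and are not part of any IMP tooling.

```python
import re
from typing import NamedTuple, Optional


class TrapEntry(NamedTuple):
    trap: int          # hex trap number, as reported to the NMC
    source: str        # source file name
    page: str          # logical page
    loc: int           # location of the trap within the page (hex)
    description: str


# Tolerates stray spaces around the parentheses and slash.
_ROW = re.compile(
    r"^\s*([0-9A-Fa-f]+)\s*\(\s*(\w+)\s*/\s*(\w+)\s*\)"
    r"\s*([0-9A-Fa-f]+)\s*:\s*(.*\S)\s*$"
)


def parse_trap_row(line: str) -> Optional[TrapEntry]:
    """Parse one 'Trap (Source/Page) Loc: Description' row, or return None."""
    m = _ROW.match(line)
    if m is None:
        return None
    trap, source, page, loc, desc = m.groups()
    return TrapEntry(int(trap, 16), source, page, int(loc, 16), desc)


if __name__ == "__main__":
    print(parse_trap_row("5 (STAGEK/LCode) A76: Local Kernel Checksum Broken"))
```

The page field is the piece needed when choosing the common memory page to
patch; rows that do not fit the pattern are returned as None rather than
guessed at.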


Trap (Source/Page) Loc: Description 

1 (STAGEK/LCode) 426: Unexpected Quit

2 (STAGEK/LCode) 5CE: program in a loop 


3 (STAGEC/RelCode) 5340: Stage MM: Completed memory management 


4 (STAGEK/LCode ) 58E: local clock stopped 
5 (STAGEK/LCode) A76: Local Kernel Checksum Broken 
6 (STAGEK/LCode) 9DC: unexpected interrupt 


7 (STAGEK/RelCode) 44B2: Stage CD: BBC map failure 


